The Spike-and-Slab LASSO
Authors
Abstract
Despite the wide adoption of spike-and-slab methodology for Bayesian variable selection, its potential for penalized likelihood estimation has largely been overlooked. In this paper, we bridge this gap by cross-fertilizing these two paradigms with the Spike-and-Slab LASSO procedure for variable selection and parameter estimation in linear regression. We introduce a new class of self-adaptive penalty functions that arise from a fully Bayes spike-and-slab formulation, ultimately moving beyond the separable penalty framework. A virtue of these non-separable penalties is their ability to borrow strength across coordinates, adapt to ensemble sparsity information and exert multiplicity adjustment. The Spike-and-Slab LASSO procedure harvests efficient Bayesian EM and coordinate-wise implementations with a path-following scheme for dynamic posterior exploration. We show on simulated data that the fully Bayes penalty mimics oracle performance, providing a viable alternative to cross-validation. We develop theory for the separable and non-separable variants of the penalty, showing rate-optimality of the global mode as well as optimal posterior concentration when p > n. Thus, the modal estimates can be supplemented with meaningful uncertainty assessments.
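To make the "self-adaptive" idea in the abstract concrete, here is a minimal sketch of the separable case. It assumes the prior on each coefficient is a two-component mixture of Laplace densities: a spike with a large rate and a slab with a small rate. The function names and the parameters `theta`, `lam0` and `lam1` are illustrative choices, not taken from the abstract, and the sketch assumes `lam0 >= lam1`.

```python
import math

def slab_weight(beta, theta, lam0, lam1):
    """Conditional probability that beta came from the slab.

    Assumed prior: theta * Laplace(lam1) + (1 - theta) * Laplace(lam0),
    with spike rate lam0 >= slab rate lam1 (hypothetical parameter names).
    Written as a ratio so that the exponent is never positive.
    """
    ratio = ((1.0 - theta) / theta) * (lam0 / lam1)
    return 1.0 / (1.0 + ratio * math.exp(-(lam0 - lam1) * abs(beta)))

def adaptive_shrinkage(beta, theta, lam0, lam1):
    """Coefficient-specific penalty rate: a weighted mix of lam1 and lam0."""
    p = slab_weight(beta, theta, lam0, lam1)
    return lam1 * p + lam0 * (1.0 - p)

if __name__ == "__main__":
    # Small coefficients get close to the spike rate, large ones the slab rate.
    for b in (0.0, 0.1, 0.5, 2.0):
        print(b, adaptive_shrinkage(b, theta=0.5, lam0=20.0, lam1=1.0))
```

Under these assumptions, a coefficient near zero is shrunk at close to the spike rate `lam0`, while a large coefficient is shrunk at close to the slab rate `lam1`. In the non-separable, fully Bayes variant described in the abstract, the mixing weight would itself depend on the other coordinates rather than being a fixed `theta`; this sketch does not attempt that.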
Similar resources
Generalized spike-and-slab priors for Bayesian group feature selection using expectation propagation
We describe a Bayesian method for group feature selection in linear regression problems. The method is based on a generalized version of the standard spike-and-slab prior distribution which is often used for individual feature selection. Exact Bayesian inference under the prior considered is infeasible for typical regression problems. However, approximate inference can be carried out efficientl...
The Spike-and-Slab Lasso Generalized Linear Models for Prediction and Associated Genes Detection.
Large-scale "omics" data have been increasingly used as an important resource for prognostic prediction of diseases and detection of associated genes. However, there are considerable challenges in analyzing high-dimensional molecular data, including the large number of potential molecular predictors, limited number of samples, and small effect of each predictor. We propose new Bayesian hierarch...
Combining a relaxed EM algorithm with Occam's razor for Bayesian variable selection in high-dimensional regression
We address the problem of Bayesian variable selection for high-dimensional linear regression. We consider a generative model that uses a spike-and-slab-like prior distribution obtained by multiplying a deterministic binary vector, which traduces the sparsity of the problem, with a random Gaussian parameter vector. The originality of the work is to consider inference through relaxing the model a...
Fast Bayesian Factor Analysis via Automatic Rotations to Sparsity
Rotational transformations have traditionally played a key role in enhancing the interpretability of factor analysis via post-hoc modifications of the factor model orientation. Regularization methods also serve to achieve this goal by prioritizing sparse loading matrices. In this work, we cross-fertilize these two paradigms within a unifying Bayesian framework. Our approach deploys intermediate...
Structured Recurrent Temporal Restricted Boltzmann Machines
The recurrent temporal restricted Boltzmann machine (RTRBM) is a probabilistic time-series model. The topology of the RTRBM graphical model, however, assumes full connectivity between all the pairs of visible units and hidden units, thereby ignoring the dependency structure within the observations. Learning this structure has the potential for not only improving the prediction performance, but ...